In the initial phase of my research project, Census Tract Selection, I selected six census tracts within Chittenden County—three with high levels of vehicle access, three with low levels—using estimates from the U.S. Census Bureau’s American Community Survey (ACS).
Estimates are accompanied by a margin of error (MOE), defined as “a measure of the possible variation of the estimate around the population value” and calculated at a 90% confidence level, the Census Bureau’s default (Fuller & U.S. Census Bureau, 2016).
As discussed in my proposal, this way of grouping creates two distinct categories for the examination and comparison of differences in aesthetic characteristics, and acts as an empirical selection method to avoid bias.
To acquire the skills necessary to appropriately form these groups, I completed the course Analyzing US Census Data in R published by DataCamp, an online learning platform.
In the Data_Collection.R script below, the tidycensus R library is used to gather and organize data on Census Tracts in Chittenden County, VT from the 2022 ACS 5-year estimates for Table B08141: Means of Transportation to Work by Vehicles Available for workers 16 years and over in households.
Within Table B08141, variables B08141_002 (“No vehicle available”), B08141_003 (“1 vehicle available”), B08141_004 (“2 vehicles available”), and B08141_005 (“3 or more vehicles available”) were used to estimate the distribution of vehicle access levels among the universe for each Census Tract.
A summed MOE was calculated using the tidycensus function moe_sum() and combined with the organized data to produce a master data table, chittenden_county_long_moe.
Note: As Census Tract 9800 (BTV - Burlington International Airport) has a population of zero, its estimates were not included.
#load tidycensus, tidyverse, and data.table libraries
library(tidycensus)
library(tidyverse)
library(data.table)
#define variables
vehicle_vars <- c(no_vehicles = "B08141_002", one_vehicle = "B08141_003", two_vehicles = "B08141_004", three_plus_vehicles = "B08141_005")
#call data (ACS 2022 5-year estimate)
chittenden_county <- get_acs(geography = "tract", variables = vehicle_vars, year = 2022, state = "VT", county = "Chittenden County", survey = "acs5")
## Getting data from the 2018-2022 5-year ACS
#delete entries for Census Tract 9800 (rows 161-164)
chittenden_county <- chittenden_county[-c(161:164), ]
#create columns for variables; combine rows of same Census Tract; delete MOE column
chittenden_county_long <- dcast(chittenden_county, GEOID + NAME ~ variable, value.var = c("estimate"))
#reorder rows by GEOID
chittenden_county_long <- arrange(chittenden_county_long, GEOID)
#reorder columns
chittenden_county_long <- chittenden_county_long[,c(1,2,3,4,6,5)]
#create dataframe with summed moe for each Census Tract (grouped by GEOID)
moe <- chittenden_county %>% group_by(GEOID) %>% summarize(MOE_GROUP_CT = moe_sum(moe = moe, estimate = estimate))
#combine chittenden_county_long and moe dataframes
chittenden_county_long_moe <- chittenden_county_long %>% mutate(MOE_GROUP_CT = moe$MOE_GROUP_CT)
| GEOID | NAME | no_vehicles | one_vehicle | two_vehicles | three_plus_vehicles | MOE_GROUP_CT |
|---|---|---|---|---|---|---|
| 50007000100 | Census Tract 1; Chittenden County; Vermont | 114 | 537 | 1270 | 547 | 337.7188 |
| 50007000200 | Census Tract 2; Chittenden County; Vermont | 14 | 721 | 1688 | 473 | 535.2635 |
| 50007000300 | Census Tract 3; Chittenden County; Vermont | 250 | 1532 | 880 | 367 | 477.3437 |
| 50007000600 | Census Tract 6; Chittenden County; Vermont | 111 | 327 | 1147 | 1086 | 581.0508 |
| 50007000800 | Census Tract 8; Chittenden County; Vermont | 38 | 375 | 805 | 263 | 329.7120 |
| 50007000900 | Census Tract 9; Chittenden County; Vermont | 37 | 379 | 578 | 410 | 397.9987 |
| 50007001000 | Census Tract 10; Chittenden County; Vermont | 86 | 687 | 733 | 33 | 406.1551 |
| 50007001100 | Census Tract 11; Chittenden County; Vermont | 52 | 403 | 500 | 372 | 332.1295 |
| 50007002101 | Census Tract 21.01; Chittenden County; Vermont | 0 | 92 | 803 | 867 | 489.8806 |
| 50007002103 | Census Tract 21.03; Chittenden County; Vermont | 0 | 179 | 1434 | 785 | 452.6323 |
| 50007002104 | Census Tract 21.04; Chittenden County; Vermont | 0 | 302 | 1229 | 362 | 397.0000 |
| 50007002201 | Census Tract 22.01; Chittenden County; Vermont | 0 | 438 | 205 | 87 | 215.7151 |
| 50007002202 | Census Tract 22.02; Chittenden County; Vermont | 53 | 1106 | 1533 | 459 | 538.6984 |
| 50007002301 | Census Tract 23.01; Chittenden County; Vermont | 0 | 35 | 345 | 330 | 181.5847 |
| 50007002303 | Census Tract 23.03; Chittenden County; Vermont | 98 | 564 | 1299 | 785 | 506.2243 |
| 50007002304 | Census Tract 23.04; Chittenden County; Vermont | 17 | 523 | 796 | 294 | 383.7095 |
| 50007002400 | Census Tract 24; Chittenden County; Vermont | 266 | 531 | 1193 | 259 | 462.6867 |
| 50007002501 | Census Tract 25.01; Chittenden County; Vermont | 0 | 280 | 609 | 580 | 467.5746 |
| 50007002502 | Census Tract 25.02; Chittenden County; Vermont | 67 | 923 | 307 | 280 | 573.1030 |
| 50007002601 | Census Tract 26.01; Chittenden County; Vermont | 188 | 1037 | 1510 | 706 | 552.9756 |
| 50007002602 | Census Tract 26.02; Chittenden County; Vermont | 66 | 648 | 1301 | 668 | 398.6427 |
| 50007002701 | Census Tract 27.01; Chittenden County; Vermont | 250 | 555 | 1442 | 989 | 584.3886 |
| 50007002702 | Census Tract 27.02; Chittenden County; Vermont | 0 | 279 | 1806 | 685 | 480.3686 |
| 50007002800 | Census Tract 28; Chittenden County; Vermont | 0 | 369 | 1293 | 1164 | 415.1169 |
| 50007002900 | Census Tract 29; Chittenden County; Vermont | 106 | 467 | 1950 | 1127 | 445.7970 |
| 50007003000 | Census Tract 30; Chittenden County; Vermont | 0 | 350 | 1390 | 715 | 346.9597 |
| 50007003101 | Census Tract 31.01; Chittenden County; Vermont | 11 | 722 | 2541 | 1301 | 711.3044 |
| 50007003102 | Census Tract 31.02; Chittenden County; Vermont | 0 | 111 | 489 | 516 | 303.2656 |
| 50007003301 | Census Tract 33.01; Chittenden County; Vermont | 19 | 318 | 1230 | 652 | 360.2291 |
| 50007003304 | Census Tract 33.04; Chittenden County; Vermont | 27 | 1011 | 2036 | 839 | 761.5314 |
| 50007003401 | Census Tract 34.01; Chittenden County; Vermont | 29 | 527 | 1303 | 1010 | 552.1322 |
| 50007003402 | Census Tract 34.02; Chittenden County; Vermont | 25 | 117 | 613 | 222 | 300.6626 |
| 50007003501 | Census Tract 35.01; Chittenden County; Vermont | 23 | 355 | 950 | 838 | 418.0132 |
| 50007003502 | Census Tract 35.02; Chittenden County; Vermont | 54 | 363 | 1712 | 738 | 465.9893 |
| 50007003503 | Census Tract 35.03; Chittenden County; Vermont | 0 | 129 | 545 | 355 | 153.2286 |
| 50007003600 | Census Tract 36; Chittenden County; Vermont | 0 | 639 | 2113 | 367 | 666.6326 |
| 50007003900 | Census Tract 39; Chittenden County; Vermont | 72 | 399 | 380 | 205 | 255.2842 |
| 50007004002 | Census Tract 40.02; Chittenden County; Vermont | 35 | 560 | 1220 | 685 | 474.0612 |
| 50007004100 | Census Tract 41; Chittenden County; Vermont | 123 | 341 | 608 | 340 | 447.7298 |
| 50007004200 | Census Tract 42; Chittenden County; Vermont | 251 | 1021 | 799 | 784 | 545.0183 |
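As a rough illustration of what a summed MOE represents, the base R sketch below applies the root-sum-of-squares approximation that the Census Bureau describes for aggregated estimates. The helper name `moe_sum_sketch` and the input values are hypothetical (they are not ACS figures), and the sketch does not attempt to reproduce any special handling of zero estimates that moe_sum() may apply when `estimate` is supplied.

```r
# Hypothetical helper: approximate the MOE of a sum of estimates as the
# square root of the sum of the squared component MOEs.
moe_sum_sketch <- function(moe) {
  sqrt(sum(moe^2, na.rm = TRUE))
}

# Made-up MOEs for the four categories of a single tract (not ACS values)
example_moe <- c(30, 40, 0, 0)

summed_moe <- moe_sum_sketch(example_moe)
summed_moe  # 50
```

Because the component MOEs are squared before summing, the summed MOE is always smaller than the simple total of the component MOEs.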
To most accurately assess vehicle access by tract, two scripts—each providing a distinct measure of vehicle access—were assembled. The first of these, PCT_0.R, creates a variable (PCT_0) that represents the estimate for “No vehicle available” as a percentage of the total estimate count for each Census Tract.
As the practical difference between having no vehicle and having one vehicle is much greater than that between having one vehicle and having two, and so forth, PCT_0 is an effective—albeit limited—measure of vehicle access. In other words, the proportion of the population (workers 16 years and over in households) in a given Census Tract who lack access to a vehicle is a relevant indicator of that tract’s overall vehicle access level.
In addition to the creation of the PCT_0 variable, the MOE for each tract’s PCT_0 value is calculated and added to the pct_0 dataframe.
#initialize dataframe for pct_0
pct_0 <- data.frame()
#find sums of counts for each Census Tract
countsum <- rowSums(chittenden_county_long_moe[,3:6])
#create new column in pct_0 with percent of tract count from no_vehicles
for (i in 1:40) {
pct_0[i,1] = 100 * (chittenden_county_long_moe[i,3] / countsum[i])
}
#name new column "PCT_0"
names(pct_0) <- "PCT_0"
#add column for GEOIDs
pct_0 <- pct_0 %>% mutate(GEOID = chittenden_county_long_moe$GEOID)
#add column for Census Tract names
pct_0 <- pct_0 %>% mutate(NAME = chittenden_county_long_moe$NAME)
#reorder columns
pct_0 <- pct_0[,c(2,3,1)]
#create new column in moe (moe for no_vehicles count)
for (i in seq(0,159,by=4)) {
moe[(i/4)+1,3] = chittenden_county[i+1,5]
}
#name new column "MOE_0_CT"
colnames(moe)[3] <- "MOE_0_CT"
#create new column in pct_0 with moe for percent of tract count from no_vehicles
for (i in 1:40) {
pct_0[i,4] = 100 * (moe_prop(num = chittenden_county_long_moe[i,3], denom = countsum[i], moe_num = moe[i,3], moe_denom = chittenden_county_long_moe[i,7]))
}
#name new column "MOE_0_PCT"
colnames(pct_0)[4] <- "MOE_0_PCT"
| GEOID | NAME | PCT_0 | MOE_0_PCT |
|---|---|---|---|
| 50007000100 | Census Tract 1; Chittenden County; Vermont | 4.6191248 | 4.2072418 |
| 50007000200 | Census Tract 2; Chittenden County; Vermont | 0.4834254 | 0.6499646 |
| 50007000300 | Census Tract 3; Chittenden County; Vermont | 8.2535490 | 5.7307751 |
| 50007000600 | Census Tract 6; Chittenden County; Vermont | 4.1557469 | 3.5172737 |
| 50007000800 | Census Tract 8; Chittenden County; Vermont | 2.5658339 | 3.8061245 |
| 50007000900 | Census Tract 9; Chittenden County; Vermont | 2.6353276 | 3.1168517 |
| 50007001000 | Census Tract 10; Chittenden County; Vermont | 5.5880442 | 3.8186984 |
| 50007001100 | Census Tract 11; Chittenden County; Vermont | 3.9186134 | 4.2592998 |
| 50007002101 | Census Tract 21.01; Chittenden County; Vermont | 0.0000000 | 0.5675369 |
| 50007002103 | Census Tract 21.03; Chittenden County; Vermont | 0.0000000 | 0.4170142 |
| 50007002104 | Census Tract 21.04; Chittenden County; Vermont | 0.0000000 | 0.5282620 |
| 50007002201 | Census Tract 22.01; Chittenden County; Vermont | 0.0000000 | 1.3698630 |
| 50007002202 | Census Tract 22.02; Chittenden County; Vermont | 1.6820057 | 2.5544783 |
| 50007002301 | Census Tract 23.01; Chittenden County; Vermont | 0.0000000 | 1.4084507 |
| 50007002303 | Census Tract 23.03; Chittenden County; Vermont | 3.5688274 | 4.3570167 |
| 50007002304 | Census Tract 23.04; Chittenden County; Vermont | 1.0429448 | 1.2021800 |
| 50007002400 | Census Tract 24; Chittenden County; Vermont | 11.8274789 | 5.6813816 |
| 50007002501 | Census Tract 25.01; Chittenden County; Vermont | 0.0000000 | 0.6807352 |
| 50007002502 | Census Tract 25.02; Chittenden County; Vermont | 4.2485732 | 4.3639585 |
| 50007002601 | Census Tract 26.01; Chittenden County; Vermont | 5.4635280 | 4.2401886 |
| 50007002602 | Census Tract 26.02; Chittenden County; Vermont | 2.4599329 | 3.7466572 |
| 50007002701 | Census Tract 27.01; Chittenden County; Vermont | 7.7255871 | 7.7556096 |
| 50007002702 | Census Tract 27.02; Chittenden County; Vermont | 0.0000000 | 0.5054152 |
| 50007002800 | Census Tract 28; Chittenden County; Vermont | 0.0000000 | 0.4953999 |
| 50007002900 | Census Tract 29; Chittenden County; Vermont | 2.9041096 | 2.9651621 |
| 50007003000 | Census Tract 30; Chittenden County; Vermont | 0.0000000 | 0.4073320 |
| 50007003101 | Census Tract 31.01; Chittenden County; Vermont | 0.2404372 | 0.4136147 |
| 50007003102 | Census Tract 31.02; Chittenden County; Vermont | 0.0000000 | 0.8960573 |
| 50007003301 | Census Tract 33.01; Chittenden County; Vermont | 0.8562416 | 1.3900933 |
| 50007003304 | Census Tract 33.04; Chittenden County; Vermont | 0.6900077 | 1.1421456 |
| 50007003401 | Census Tract 34.01; Chittenden County; Vermont | 1.0108052 | 1.2043281 |
| 50007003402 | Census Tract 34.02; Chittenden County; Vermont | 2.5588536 | 3.7043290 |
| 50007003501 | Census Tract 35.01; Chittenden County; Vermont | 1.0618652 | 1.6958812 |
| 50007003502 | Census Tract 35.02; Chittenden County; Vermont | 1.8835019 | 2.6331180 |
| 50007003503 | Census Tract 35.03; Chittenden County; Vermont | 0.0000000 | 0.9718173 |
| 50007003600 | Census Tract 36; Chittenden County; Vermont | 0.0000000 | 0.3206156 |
| 50007003900 | Census Tract 39; Chittenden County; Vermont | 6.8181818 | 6.4205944 |
| 50007004002 | Census Tract 40.02; Chittenden County; Vermont | 1.4000000 | 2.2242130 |
| 50007004100 | Census Tract 41; Chittenden County; Vermont | 8.7110482 | 6.2899116 |
| 50007004200 | Census Tract 42; Chittenden County; Vermont | 8.7915937 | 4.0817367 |
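The calculation behind PCT_0 and its MOE can be traced by hand for a single tract. The sketch below uses Census Tract 21.01 (estimates and summed MOE copied from the master table, with the MOE of 10 that accompanies its zero estimate for “No vehicle available”). `moe_prop_sketch` is a hypothetical stand-in that follows the Census Bureau approximation for the MOE of a proportion; it is an assumption-laden illustration, not the tidycensus source.

```r
# Hypothetical helper: approximate MOE of a proportion p = num / denom
moe_prop_sketch <- function(num, denom, moe_num, moe_denom) {
  p <- num / denom
  radicand <- moe_num^2 - (p^2 * moe_denom^2)
  # When the radicand is negative, fall back to the ratio formula (plus sign)
  radicand <- ifelse(radicand < 0, moe_num^2 + (p^2 * moe_denom^2), radicand)
  sqrt(radicand) / denom
}

# Census Tract 21.01: category estimates from the master table
counts <- c(no_vehicles = 0, one_vehicle = 92,
            two_vehicles = 803, three_plus_vehicles = 867)
total <- sum(counts)

# Percentage of workers with no vehicle available
pct_0_example <- 100 * counts[["no_vehicles"]] / total

# MOE of that percentage (no_vehicles MOE = 10, summed MOE = 489.8806)
moe_0_pct_example <- 100 * moe_prop_sketch(counts[["no_vehicles"]], total,
                                           moe_num = 10, moe_denom = 489.8806)
moe_0_pct_example  # about 0.5675, in line with the MOE_0_PCT table
```

When the estimate is zero, the proportion is zero and the result reduces to the numerator's MOE divided by the tract total, which is why tracts with no zero-vehicle workers still carry a nonzero MOE_0_PCT.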
To account for variation between tracts in categories other than “No vehicle available”, a second variable (wAvg) was calculated to represent the average number of vehicles available in each Census Tract.
wAvg was derived by finding the proportion of the total estimate count represented by each category’s estimate, weighting these proportions by their categories’ implied values (e.g., “1 vehicle available” -> 1, with “3 or more vehicles available” treated as 3), and summing the weighted proportions to yield a weighted average for each tract.
#initialize dataframe for weighted_avg
weighted_avg <- data.frame()
#create new column in weighted_avg with weighted average of vehicles available by tract
for (i in 1:40) {
weighted_avg[i,1] = ((chittenden_county_long_moe[i,3] / countsum[i])*0) + ((chittenden_county_long_moe[i,4] / countsum[i])*1) + ((chittenden_county_long_moe[i,5] / countsum[i])*2) + ((chittenden_county_long_moe[i,6] / countsum[i])*3)
}
#name new column "wAvg"
names(weighted_avg) <- "wAvg"
#add GEOIDs and Census Tract names, organize columns
#create new column in weighted_avg with GEOIDs
weighted_avg <- weighted_avg %>% mutate(GEOID = chittenden_county_long_moe$GEOID)
#create new column in weighted_avg with Census Tract names
weighted_avg <- weighted_avg %>% mutate(NAME = chittenden_county_long_moe$NAME)
#reorder columns
weighted_avg <- weighted_avg[,c(2,3,1)]
| GEOID | NAME | wAvg |
|---|---|---|
| 50007000100 | Census Tract 1; Chittenden County; Vermont | 1.911669 |
| 50007000200 | Census Tract 2; Chittenden County; Vermont | 1.904696 |
| 50007000300 | Census Tract 3; Chittenden County; Vermont | 1.450314 |
| 50007000600 | Census Tract 6; Chittenden County; Vermont | 2.201048 |
| 50007000800 | Census Tract 8; Chittenden County; Vermont | 1.873059 |
| 50007000900 | Census Tract 9; Chittenden County; Vermont | 1.969373 |
| 50007001000 | Census Tract 10; Chittenden County; Vermont | 1.463288 |
| 50007001100 | Census Tract 11; Chittenden County; Vermont | 1.898267 |
| 50007002101 | Census Tract 21.01; Chittenden County; Vermont | 2.439841 |
| 50007002103 | Census Tract 21.03; Chittenden County; Vermont | 2.252711 |
| 50007002104 | Census Tract 21.04; Chittenden County; Vermont | 2.031696 |
| 50007002201 | Census Tract 22.01; Chittenden County; Vermont | 1.519178 |
| 50007002202 | Census Tract 22.02; Chittenden County; Vermont | 1.761028 |
| 50007002301 | Census Tract 23.01; Chittenden County; Vermont | 2.415493 |
| 50007002303 | Census Tract 23.03; Chittenden County; Vermont | 2.009104 |
| 50007002304 | Census Tract 23.04; Chittenden County; Vermont | 1.838650 |
| 50007002400 | Census Tract 24; Chittenden County; Vermont | 1.642508 |
| 50007002501 | Census Tract 25.01; Chittenden County; Vermont | 2.204221 |
| 50007002502 | Census Tract 25.02; Chittenden County; Vermont | 1.507292 |
| 50007002601 | Census Tract 26.01; Chittenden County; Vermont | 1.794536 |
| 50007002602 | Census Tract 26.02; Chittenden County; Vermont | 1.958256 |
| 50007002701 | Census Tract 27.01; Chittenden County; Vermont | 1.979604 |
| 50007002702 | Census Tract 27.02; Chittenden County; Vermont | 2.146570 |
| 50007002800 | Census Tract 28; Chittenden County; Vermont | 2.281316 |
| 50007002900 | Census Tract 29; Chittenden County; Vermont | 2.122740 |
| 50007003000 | Census Tract 30; Chittenden County; Vermont | 2.148676 |
| 50007003101 | Census Tract 31.01; Chittenden County; Vermont | 2.121749 |
| 50007003102 | Census Tract 31.02; Chittenden County; Vermont | 2.362903 |
| 50007003301 | Census Tract 33.01; Chittenden County; Vermont | 2.133393 |
| 50007003304 | Census Tract 33.04; Chittenden County; Vermont | 1.942244 |
| 50007003401 | Census Tract 34.01; Chittenden County; Vermont | 2.148135 |
| 50007003402 | Census Tract 34.02; Chittenden County; Vermont | 2.056295 |
| 50007003501 | Census Tract 35.01; Chittenden County; Vermont | 2.201754 |
| 50007003502 | Census Tract 35.02; Chittenden County; Vermont | 2.093129 |
| 50007003503 | Census Tract 35.03; Chittenden County; Vermont | 2.219631 |
| 50007003600 | Census Tract 36; Chittenden County; Vermont | 1.912793 |
| 50007003900 | Census Tract 39; Chittenden County; Vermont | 1.679924 |
| 50007004002 | Census Tract 40.02; Chittenden County; Vermont | 2.022000 |
| 50007004100 | Census Tract 41; Chittenden County; Vermont | 1.825071 |
| 50007004200 | Census Tract 42; Chittenden County; Vermont | 1.741156 |
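The weighted average can also be computed without a loop. The following sketch repeats the calculation for two tracts (Census Tracts 1 and 3, with estimates copied from the master table) using a matrix product; the object names `tracts`, `weights`, and `counts` are illustrative only.

```r
# Estimates for two tracts, copied from the master table
tracts <- data.frame(
  NAME = c("Census Tract 1", "Census Tract 3"),
  no_vehicles = c(114, 250),
  one_vehicle = c(537, 1532),
  two_vehicles = c(1270, 880),
  three_plus_vehicles = c(547, 367)
)

# Implied number of vehicles per category; "3 or more" is counted as 3
weights <- c(0, 1, 2, 3)

# Weighted sum of counts divided by the tract total
counts <- as.matrix(tracts[, 2:5])
tracts$wAvg <- as.vector(counts %*% weights) / rowSums(counts)

round(tracts$wAvg, 6)  # 1.911669 1.450314
```

Because households with more than three vehicles are counted as having three, wAvg is best read as a lower bound on the true average.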
With two distinct measures of vehicle access (PCT_0 and wAvg) created, both variables were normalized and combined to form an index score (Vehicle_Access) that represents each tract’s relative level of vehicle access.
Variables PCT_0 and wAvg were normalized using the min-max scaling method, where data values are rescaled to the range 0 to 1, via the preProcess() and predict() functions in the caret R library. As the value of PCT_0 has a negative relationship with vehicle access, its normalized values were multiplied by -1 when combined with wAvg’s normalized values to form Vehicle_Access, which therefore ranges from -1 to 1.
PCT_0 and wAvg (un-normalized) were then used as primary axes in a scatterplot (created using the ggplot2 and plotly R libraries) where the color of each point (representing a given tract) corresponds to that point’s Vehicle Access Score (Vehicle_Access).
#load ggplot2, plotly, and caret libraries
library(caret)
library(ggplot2)
library(plotly)
#join variable 1 and variable 2 in new datatable
var1_var2_join <- left_join(pct_0, weighted_avg, by = "GEOID")
#remove NAME.y column
var1_var2_join <- var1_var2_join %>% mutate(NAME.y = NULL)
#rename GEOID and NAME.x column
colnames(var1_var2_join)[1] <- "GEOID"
colnames(var1_var2_join)[2] <- "NAME"
#normalize (rescale 0:1) variable 1
PCT_0_process <- preProcess(as.data.frame(var1_var2_join$PCT_0), method=c("range"))
PCT_0_norm <- predict(PCT_0_process, as.data.frame(var1_var2_join$PCT_0))
#normalize (rescale 0:1) variable 2
wAvg_process <- preProcess(as.data.frame(var1_var2_join$wAvg), method=c("range"))
wAvg_norm <- predict(wAvg_process, as.data.frame(var1_var2_join$wAvg))
#create vehicle access score using normalized variables
for(i in 1:40) {
var1_var2_join[i,6] = (-1 * PCT_0_norm[i,1]) + wAvg_norm[i,1]
}
colnames(var1_var2_join)[6] <- "Vehicle_Access"
| GEOID | NAME | PCT_0 | MOE_0_PCT | wAvg | Vehicle_Access |
|---|---|---|---|---|---|
| 50007000100 | Census Tract 1; Chittenden County; Vermont | 4.6191248 | 4.2072418 | 1.911669 | 0.0756966 |
| 50007000200 | Census Tract 2; Chittenden County; Vermont | 0.4834254 | 0.6499646 | 1.904696 | 0.4183183 |
| 50007000300 | Census Tract 3; Chittenden County; Vermont | 8.2535490 | 5.7307751 | 1.450314 | -0.6978283 |
| 50007000600 | Census Tract 6; Chittenden County; Vermont | 4.1557469 | 3.5172737 | 2.201048 | 0.4073163 |
| 50007000800 | Census Tract 8; Chittenden County; Vermont | 2.5658339 | 3.8061245 | 1.873059 | 0.2102808 |
| 50007000900 | Census Tract 9; Chittenden County; Vermont | 2.6353276 | 3.1168517 | 1.969373 | 0.3017390 |
| 50007001000 | Census Tract 10; Chittenden County; Vermont | 5.5880442 | 3.8186984 | 1.463288 | -0.4593513 |
| 50007001100 | Census Tract 11; Chittenden County; Vermont | 3.9186134 | 4.2592998 | 1.898267 | 0.1213796 |
| 50007002101 | Census Tract 21.01; Chittenden County; Vermont | 0.0000000 | 0.5675369 | 2.439841 | 1.0000000 |
| 50007002103 | Census Tract 21.03; Chittenden County; Vermont | 0.0000000 | 0.4170142 | 2.252711 | 0.8108890 |
| 50007002104 | Census Tract 21.04; Chittenden County; Vermont | 0.0000000 | 0.5282620 | 2.031696 | 0.5875351 |
| 50007002201 | Census Tract 22.01; Chittenden County; Vermont | 0.0000000 | 1.3698630 | 1.519178 | 0.0695933 |
| 50007002202 | Census Tract 22.02; Chittenden County; Vermont | 1.6820057 | 2.5544783 | 1.761028 | 0.1717913 |
| 50007002301 | Census Tract 23.01; Chittenden County; Vermont | 0.0000000 | 1.4084507 | 2.415493 | 0.9753942 |
| 50007002303 | Census Tract 23.03; Chittenden County; Vermont | 3.5688274 | 4.3570167 | 2.009104 | 0.2629641 |
| 50007002304 | Census Tract 23.04; Chittenden County; Vermont | 1.0429448 | 1.2021800 | 1.838650 | 0.3042668 |
| 50007002400 | Census Tract 24; Chittenden County; Vermont | 11.8274789 | 5.6813816 | 1.642508 | -0.8057718 |
| 50007002501 | Census Tract 25.01; Chittenden County; Vermont | 0.0000000 | 0.6807352 | 2.204221 | 0.7618858 |
| 50007002502 | Census Tract 25.02; Chittenden County; Vermont | 4.2485732 | 4.3639585 | 1.507292 | -0.3016304 |
| 50007002601 | Census Tract 26.01; Chittenden County; Vermont | 5.4635280 | 4.2401886 | 1.794536 | -0.1140693 |
| 50007002602 | Census Tract 26.02; Chittenden County; Vermont | 2.4599329 | 3.7466572 | 1.958256 | 0.3053332 |
| 50007002701 | Census Tract 27.01; Chittenden County; Vermont | 7.7255871 | 7.7556096 | 1.979604 | -0.1182972 |
| 50007002702 | Census Tract 27.02; Chittenden County; Vermont | 0.0000000 | 0.5054152 | 2.146570 | 0.7036255 |
| 50007002800 | Census Tract 28; Chittenden County; Vermont | 0.0000000 | 0.4953999 | 2.281316 | 0.8397975 |
| 50007002900 | Census Tract 29; Chittenden County; Vermont | 2.9041096 | 2.9651621 | 2.122740 | 0.4340034 |
| 50007003000 | Census Tract 30; Chittenden County; Vermont | 0.0000000 | 0.4073320 | 2.148676 | 0.7057536 |
| 50007003101 | Census Tract 31.01; Chittenden County; Vermont | 0.2404372 | 0.4136147 | 2.121749 | 0.6582124 |
| 50007003102 | Census Tract 31.02; Chittenden County; Vermont | 0.0000000 | 0.8960573 | 2.362903 | 0.9222479 |
| 50007003301 | Census Tract 33.01; Chittenden County; Vermont | 0.8562416 | 1.3900933 | 2.133393 | 0.6179148 |
| 50007003304 | Census Tract 33.04; Chittenden County; Vermont | 0.6900077 | 1.1421456 | 1.942244 | 0.4387971 |
| 50007003401 | Census Tract 34.01; Chittenden County; Vermont | 1.0108052 | 1.2043281 | 2.148135 | 0.6197445 |
| 50007003402 | Census Tract 34.02; Chittenden County; Vermont | 2.5588536 | 3.7043290 | 2.056295 | 0.3960463 |
| 50007003501 | Census Tract 35.01; Chittenden County; Vermont | 1.0618652 | 1.6958812 | 2.201754 | 0.6696140 |
| 50007003502 | Census Tract 35.02; Chittenden County; Vermont | 1.8835019 | 2.6331180 | 2.093129 | 0.4903703 |
| 50007003503 | Census Tract 35.03; Chittenden County; Vermont | 0.0000000 | 0.9718173 | 2.219631 | 0.7774590 |
| 50007003600 | Census Tract 36; Chittenden County; Vermont | 0.0000000 | 0.3206156 | 1.912793 | 0.4673735 |
| 50007003900 | Census Tract 39; Chittenden County; Vermont | 6.8181818 | 6.4205944 | 1.679924 | -0.3444289 |
| 50007004002 | Census Tract 40.02; Chittenden County; Vermont | 1.4000000 | 2.2242130 | 2.022000 | 0.4593683 |
| 50007004100 | Census Tract 41; Chittenden County; Vermont | 8.7110482 | 6.2899116 | 1.825071 | -0.3577859 |
| 50007004200 | Census Tract 42; Chittenden County; Vermont | 8.7915937 | 4.0817367 | 1.741156 | -0.4493990 |
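For reference, the same index can be sketched in base R without caret. The helper `rescale_01` is hypothetical, and only three tracts are included (21.01, 3, and 24, with values copied from the table above). Because those three tracts hold the minimum and maximum of both variables across the county, their scores here agree with the full table.

```r
# Hypothetical helper: min-max scaling to the range 0 to 1
# (assumes x is not constant, otherwise the denominator is zero)
rescale_01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Values for Census Tracts 21.01, 3, and 24, copied from the table above
pct_0_vals <- c(0.0000000, 8.2535490, 11.8274789)
w_avg_vals <- c(2.439841, 1.450314, 1.642508)

# Higher PCT_0 means lower access, so its scaled value is subtracted
vehicle_access <- rescale_01(w_avg_vals) - rescale_01(pct_0_vals)
round(vehicle_access, 4)  # 1.0000 -0.6978 -0.8058
```

Since the two scaled variables receive equal weight, a tract scores 1 only if it has both the highest wAvg and the lowest PCT_0 in the county.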
#graph pct_0 vs wAvg (w/ Vehicle_Access)
var1_var2 <- ggplot(var1_var2_join, aes(x = PCT_0, y = wAvg, color = Vehicle_Access, label = NAME)) + geom_point(size = 2) + labs(x = "% No Vehicles", y = "Average # of Vehicles", color = "Vehicle Access Score", title = "Measures of Vehicle Access by Census Tract") + theme(plot.margin = margin(l = 5))
ggplotly(var1_var2) %>%
layout(margin = list(r = 15), title = list(text = paste0('Measures of Vehicle Access by Census Tract in Chittenden County, VT',
'<br>',
'<sup>',
'Data source: 2018-2022 ACS. Data acquired with the R tidycensus package.','</sup>')))
With the vehicle access of Chittenden County, VT Census Tracts graphically represented, outliers on either extreme may be qualitatively selected. To ensure the accurate selection of high and low vehicle access groups, however, some further steps are necessary.
First, the base R function quantile() was used to identify datapoints in the upper and lower 10% of the distribution of Vehicle_Access values.
#use quantile() function to select tracts (high Vehicle_Access & low Vehicle_Access)
#upper 10% of distribution (High)
var1_var2_join[which(var1_var2_join$Vehicle_Access > quantile(var1_var2_join$Vehicle_Access,.9)),]
## GEOID NAME PCT_0 MOE_0_PCT
## 9 50007002101 Census Tract 21.01; Chittenden County; Vermont 0 0.5675369
## 14 50007002301 Census Tract 23.01; Chittenden County; Vermont 0 1.4084507
## 24 50007002800 Census Tract 28; Chittenden County; Vermont 0 0.4953999
## 28 50007003102 Census Tract 31.02; Chittenden County; Vermont 0 0.8960573
## wAvg Vehicle_Access
## 9 2.439841 1.0000000
## 14 2.415493 0.9753942
## 24 2.281316 0.8397975
## 28 2.362903 0.9222479
#lower 10% of distribution (Low)
var1_var2_join[which(var1_var2_join$Vehicle_Access < quantile(var1_var2_join$Vehicle_Access,.1)),]
## GEOID NAME PCT_0 MOE_0_PCT
## 3 50007000300 Census Tract 3; Chittenden County; Vermont 8.253549 5.730775
## 7 50007001000 Census Tract 10; Chittenden County; Vermont 5.588044 3.818698
## 17 50007002400 Census Tract 24; Chittenden County; Vermont 11.827479 5.681382
## 40 50007004200 Census Tract 42; Chittenden County; Vermont 8.791594 4.081737
## wAvg Vehicle_Access
## 3 1.450314 -0.6978283
## 7 1.463288 -0.4593513
## 17 1.642508 -0.8057718
## 40 1.741156 -0.4493990
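The reason each group contains four tracts follows from how quantile() interpolates. The sketch below uses 40 evenly spaced stand-in scores (hypothetical values, not the real Vehicle_Access index) to show that, with 40 distinct values and the default quantile type, strict comparisons against the 10th and 90th percentiles each return four rows.

```r
# Stand-in scores for 40 tracts (hypothetical, evenly spaced)
scores <- seq(-1, 1, length.out = 40)

# Default (type 7) quantiles fall between observed values here
upper_cut <- quantile(scores, 0.9)
lower_cut <- quantile(scores, 0.1)

# Strict comparisons, as in the selection step above
high_idx <- which(scores > upper_cut)
low_idx <- which(scores < lower_cut)

length(high_idx)  # 4
length(low_idx)   # 4
```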
This process yielded two groups (high and low vehicle access) of 4 Census Tracts based on their Vehicle Access Score (Vehicle_Access). However, these groups must ultimately contain only 3 tracts each.
To further refine the selection process, a condition was set that the estimate counts for all categories in a given tract must exceed those categories’ corresponding MOEs. This condition was formulated based on the Chittenden County Regional Planning Commission’s (CCRPC) 2018 guide, Best Practices for Reporting American Community Survey in Municipal Planning, which recommends avoiding the use of data where “the MOE for [a] population is higher than the estimate itself” (CCRPC, 2018).
Using the chittenden_county dataframe, the condition was checked for the 8 tracts selected in the previous step, with “YES” or “NO” printed for each category in each tract.
Note: As all tracts in the high vehicle access group had an estimate of 0 for the “No vehicle available” category (resulting in a MOE of 10), the above condition was only assessed on the other three categories for these tracts.
#check that estimate > MOE for selected tracts (using chittenden_county)
#Low group
#Census Tract 3
for(i in 9:12) {
if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
print("YES")
}
else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
print("NO")
}
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 10
for(i in 25:28) {
if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
print("YES")
}
else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
print("NO")
}
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
## [1] "NO"
#Census Tract 24
for(i in 65:68) {
if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
print("YES")
}
else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
print("NO")
}
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 42
for(i in 157:160) {
if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
print("YES")
}
else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
print("NO")
}
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
## [1] "YES"
#High group (ignore no_vehicles variable)
#Census Tract 21.01
for(i in 34:36) {
if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
print("YES")
}
else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
print("NO")
}
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 23.01
for(i in 54:56) {
if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
print("YES")
}
else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
print("NO")
}
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 28
for(i in 94:96) {
if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
print("YES")
}
else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
print("NO")
}
}
## [1] "YES"
## [1] "YES"
## [1] "YES"
#Census Tract 31.02
for(i in 110:112) {
if(chittenden_county$moe[i] < chittenden_county$estimate[i]) {
print("YES")
}
else if(chittenden_county$moe[i] >= chittenden_county$estimate[i]) {
print("NO")
}
}
## [1] "NO"
## [1] "YES"
## [1] "YES"
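The same condition can be expressed as a single vectorized comparison rather than one loop per tract. In the sketch below, the estimates are those of Census Tract 10 from the master table, but the MOE values are invented for illustration, since the per-category MOEs are not printed in this document. The data frame `tract_rows` is likewise hypothetical.

```r
# One tract in long format; estimates are Census Tract 10's,
# MOE values are invented placeholders (not ACS figures)
tract_rows <- data.frame(
  variable = c("no_vehicles", "one_vehicle",
               "two_vehicles", "three_plus_vehicles"),
  estimate = c(86, 687, 733, 33),
  moe = c(60, 250, 260, 45)
)

# TRUE where the estimate exceeds its MOE
tract_rows$passes <- tract_rows$estimate > tract_rows$moe

# A tract is retained only if every category passes
tract_ok <- all(tract_rows$passes)
tract_ok  # FALSE
```

Applied to the full long-format table, the same comparison could be grouped by GEOID, which would avoid hard-coding row ranges for each tract.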
As Census Tract 10 and Census Tract 31.02 possessed MOEs in excess of corresponding estimate counts, both tracts were eliminated from their respective groups, leaving 3 tracts in each group:
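Low Vehicle Access Tracts: Census Tract 3, Census Tract 24, Census Tract 42
High Vehicle Access Tracts: Census Tract 21.01, Census Tract 23.01, Census Tract 28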
While the tracts that remain have each met the Estimate > MOE condition, the statistical reliability of their data must be further assessed prior to concluding the selection process.
To do so, the mean coefficients of variation (CV) for selected tracts were calculated according to Census Bureau (2018) guidelines, and results were organized in a new dataframe, cv.
Note: As all tracts in the high vehicle access group had an estimate of 0 for the “No vehicle available” category (resulting in a MOE of 10), mean CV was calculated from the other three variables’ estimate and MOEs for these tracts.
#find CV for selected tracts' variables
#Low
cv24 <- c((((chittenden_county$moe[65] / 1.645) / chittenden_county$estimate[65]) * 100),
(((chittenden_county$moe[66] / 1.645) / chittenden_county$estimate[66]) * 100),
(((chittenden_county$moe[67] / 1.645) / chittenden_county$estimate[67]) * 100),
(((chittenden_county$moe[68] / 1.645) / chittenden_county$estimate[68]) * 100)
)
cv3 <- c((((chittenden_county$moe[9] / 1.645) / chittenden_county$estimate[9]) * 100),
(((chittenden_county$moe[10] / 1.645) / chittenden_county$estimate[10]) * 100),
(((chittenden_county$moe[11] / 1.645) / chittenden_county$estimate[11]) * 100),
(((chittenden_county$moe[12] / 1.645) / chittenden_county$estimate[12]) * 100)
)
cv42 <- c((((chittenden_county$moe[157] / 1.645) / chittenden_county$estimate[157]) * 100),
(((chittenden_county$moe[158] / 1.645) / chittenden_county$estimate[158]) * 100),
(((chittenden_county$moe[159] / 1.645) / chittenden_county$estimate[159]) * 100),
(((chittenden_county$moe[160] / 1.645) / chittenden_county$estimate[160]) * 100)
)
#High (w/o no_vehicles (where estimate = 0))
cv21.01 <- c((((chittenden_county$moe[34] / 1.645) / chittenden_county$estimate[34]) * 100),
(((chittenden_county$moe[35] / 1.645) / chittenden_county$estimate[35]) * 100),
(((chittenden_county$moe[36] / 1.645) / chittenden_county$estimate[36]) * 100)
)
cv23.01 <- c((((chittenden_county$moe[54] / 1.645) / chittenden_county$estimate[54]) * 100),
(((chittenden_county$moe[55] / 1.645) / chittenden_county$estimate[55]) * 100),
(((chittenden_county$moe[56] / 1.645) / chittenden_county$estimate[56]) * 100)
)
cv28 <- c((((chittenden_county$moe[94] / 1.645) / chittenden_county$estimate[94]) * 100),
(((chittenden_county$moe[95] / 1.645) / chittenden_county$estimate[95]) * 100),
(((chittenden_county$moe[96] / 1.645) / chittenden_county$estimate[96]) * 100)
)
#group CV means for selected tracts
cv_low <- c(mean(cv24), mean(cv3), mean(cv42))
cv_high <- c(mean(cv21.01), mean(cv23.01), mean(cv28))
#create dataframe for selected tracts' CVs (low group first, then high group)
cv <- data.frame(V1 = c(cv_low, cv_high))
#add tract names, name columns
cv <- cv %>%
mutate(NAME = c("Census Tract 24; Chittenden County; Vermont", "Census Tract 3; Chittenden County; Vermont", "Census Tract 42; Chittenden County; Vermont", "Census Tract 21.01; Chittenden County; Vermont", "Census Tract 23.01; Chittenden County; Vermont", "Census Tract 28; Chittenden County; Vermont")) %>%
rename("CV" = V1)
#reorder columns
cv <- cv[,c(2,1)]
| NAME | CV |
|---|---|
| Census Tract 24; Chittenden County; Vermont | 27.15444 |
| Census Tract 3; Chittenden County; Vermont | 25.76912 |
| Census Tract 42; Chittenden County; Vermont | 23.94867 |
| Census Tract 21.01; Chittenden County; Vermont | 28.52932 |
| Census Tract 23.01; Chittenden County; Vermont | 30.80666 |
| Census Tract 28; Chittenden County; Vermont | 19.89782 |
Mean CVs for the selected tracts can now be compared with the CCRPC (2018) guidelines for assessing the statistical reliability of ACS data.
Although the mean CV for Census Tract 23.01 slightly exceeds 30%, placing it in the “Low Reliability” category, the mean CVs for the other selected tracts fall between 15% and 30%. This indicates a Medium to Medium-Low level of statistical reliability, which, given Vermont’s low population density (and therefore relatively high MOEs for ACS data), is usable in the context of this project’s Census Tract Selection phase.
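The comparison can be sketched with two small helpers. This is a minimal illustration, not part of the original script: `cv_from_moe()` and `cv_band()` are hypothetical names, the 15% and 30% cut-offs are the ones quoted in the paragraph above, and the bands are labelled by range only because the exact category wording should be checked against the CCRPC (2018) guide.

```r
# Convert a 90% margin of error into a coefficient of variation (percent).
# 1.645 is the z-score for the Census Bureau's default 90% confidence level.
cv_from_moe <- function(estimate, moe) {
  ((moe / 1.645) / estimate) * 100
}

# Assign each CV to a band using the 15% and 30% cut-offs (hypothetical labels).
cv_band <- function(cv) {
  cut(cv,
      breaks = c(-Inf, 15, 30, Inf),
      labels = c("below 15%", "15% to 30%", "above 30%"))
}

# Mean CVs copied from the table above
mean_cv <- c("24" = 27.15444, "3" = 25.76912, "42" = 23.94867,
             "21.01" = 28.52932, "23.01" = 30.80666, "28" = 19.89782)

cv_check <- data.frame(tract = names(mean_cv),
                       CV = unname(mean_cv),
                       band = cv_band(mean_cv))
print(cv_check)
```

Run on the table's values, only Census Tract 23.01 lands in the "above 30%" band.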
As noted by the CCRPC (2018), the American Community Survey provides estimates that reflect a community’s social and economic conditions. An estimate is “NOT an official count of the population nor is it a point in time count” (CCRPC, 2018). Therefore, it is extremely important to consider MOE when using ACS data to inform decision making.
To reflect this inherent uncertainty in the data, I initially sought to calculate wAvg in a manner that accounts for MOE. First, I found each tract’s wAvg using estimates alone, treating the data as exact. Then, I calculated each tract’s wAvg with MOEs added to the estimates before weighting and summing (wAvg_MOE). The difference between wAvg_MOE and wAvg was intended to serve as the propagated MOE for wAvg, which could then be displayed as a vertical error bar on a graph plotting PCT_0 and wAvg.
Example:
#weighted average (+ moe)
#calculate sum of count and moe for each variable
#for no_vehicles
for(i in 1:40) {
moe[i,7] = chittenden_county_long_moe[i,3] + moe[i,3]
}
colnames(moe)[7] <- "CountMOE_Sum_0"
#for one_vehicle
for(i in 1:40) {
moe[i,8] = chittenden_county_long_moe[i,4] + moe[i,4]
}
colnames(moe)[8] <- "CountMOE_Sum_1"
#for two_vehicles
for(i in 1:40) {
moe[i,9] = chittenden_county_long_moe[i,5] + moe[i,5]
}
colnames(moe)[9] <- "CountMOE_Sum_2"
#for three_plus_vehicles
for(i in 1:40) {
moe[i,10] = chittenden_county_long_moe[i,6] + moe[i,6]
}
colnames(moe)[10] <- "CountMOE_Sum_3+"
#calculate sum of countsum and moesum (count + moe rowsums)
countMOE_rowsum <-rowSums(moe[,7:10])
#find weighted average (+ moe)
for (i in 1:40) {
weighted_avg[i,2] = ((moe[i,7] / countMOE_rowsum[i])*0) + ((moe[i,8] / countMOE_rowsum[i])*1) + ((moe[i,9] / countMOE_rowsum[i])*2) + ((moe[i,10] / countMOE_rowsum[i])*3)
}
#find difference between weighted average and weighted average (+ moe)
for (i in 1:40) {
weighted_avg[i,3] = (weighted_avg[i,2] - weighted_avg[i,1])
}
However, a substantial proportion of these differences were negative:
As MOEs had been added to estimates, not subtracted, this indicated an issue with either my approach or my data.
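A toy calculation shows how adding MOEs to every count can still lower a weighted average. The figures below are invented for illustration and do not come from the ACS data; `w_avg()` is a hypothetical helper that mirrors the share-times-weight calculation in the script above.

```r
# Hypothetical tract in which the no_vehicles MOE (40) exceeds its estimate (10)
toy_estimate <- c(no_vehicles = 10, one_vehicle = 300,
                  two_vehicles = 500, three_plus_vehicles = 200)
toy_moe      <- c(no_vehicles = 40, one_vehicle = 50,
                  two_vehicles = 60, three_plus_vehicles = 40)
vehicle_wts  <- c(0, 1, 2, 3)

# Weighted average: each category's share of the total times its weight
w_avg <- function(counts, w) sum(counts / sum(counts) * w)

toy_wavg     <- w_avg(toy_estimate, vehicle_wts)            # 1900 / 1010
toy_wavg_moe <- w_avg(toy_estimate + toy_moe, vehicle_wts)  # 2190 / 1200
toy_diff     <- toy_wavg_moe - toy_wavg

print(c(wAvg = toy_wavg, wAvg_MOE = toy_wavg_moe, difference = toy_diff))
```

Every count increases, but the weighted average depends on shares rather than counts. Here the zero-weight category grows fivefold while the others grow by at most 20%, so its share rises and the average falls.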
Upon examining the original data for the tracts where wAvg_MOE - wAvg was most negative, I found that the majority of tracts with negative differences contained a variable whose MOE exceeded its estimate. Vermont’s low population density, even within its most densely populated county, could be a significant factor in explaining why these MOEs were so large.
The relationship between estimate size and its corresponding MOE in the data can be observed in the graph below:
Consistent with the distribution of differences shown earlier, 21.875% of estimates had MOEs greater than their value. Further, these instances tended to be widely spread among tracts, not isolated in a problematic few.
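A check of this kind can be sketched as follows. The data frame is invented for illustration; only the `estimate` and `moe` column names match the ACS pull used in this project, and the `tract` labels are placeholders.

```r
# Hypothetical long-format data: two tracts, four vehicle-access variables each
toy_acs <- data.frame(
  tract    = rep(c("Tract A", "Tract B"), each = 4),
  variable = rep(c("no_vehicles", "one_vehicle",
                   "two_vehicles", "three_plus_vehicles"), times = 2),
  estimate = c(12, 410, 655, 230, 0, 120, 300, 95),
  moe      = c(19, 88, 102, 61, 13, 45, 70, 99)
)

# Flag estimates whose MOE is larger than the estimate itself
toy_acs$moe_exceeds <- toy_acs$moe > toy_acs$estimate

# Share of all estimates affected, as a percentage
pct_exceeding <- mean(toy_acs$moe_exceeds) * 100

# Number of affected estimates per tract, to see how widely they are spread
per_tract <- tapply(toy_acs$moe_exceeds, toy_acs$tract, sum)

print(pct_exceeding)
print(per_tract)
```

The per-tract counts are what separate a few problematic tracts from an issue spread across the whole county.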
Because I discovered this issue toward the end of the time I had allotted for the Census Tract Selection phase, and because no suitable alternative ACS data were available, I decided to proceed with the data I had already collected while making a few changes to my approach.
Rather than incorporating MOEs into the wAvg variable, I chose to ignore MOE until Vehicle_Access was calculated and tracts in the upper and lower 10% of the distribution were identified. Then, I evaluated the MOE for both groups’ tracts by comparing their mean CVs with CCRPC guidelines, as detailed in the “Selecting Tracts” section above.
In this way, I was able to make informed (accounting for MOE) selections of Census Tracts while maintaining my original project timeline.
Chittenden County Regional Planning Commission (CCRPC). (2018). Best Practices for Reporting American Community Survey in Municipal Planning. https://www.ccrpcvt.org/wp-content/uploads/2018/10/ACS_Guide_Final_20181003.pdf
Fuller, S. & U.S. Census Bureau. (2016). Using ACS Estimates and Margins of Error. https://www.census.gov/content/dam/Census/programs-surveys/acs/guidance/training-presentations/2016_MOE_Slides_01.pdf
U.S. Census Bureau. (2018). 8. Calculating Measures of Error for Derived Estimates. https://www.census.gov/content/dam/Census/library/publications/2018/acs/acs_general_handbook_2018_ch08.pdf
U.S. Census Bureau. (2022). Means of Transportation to Work by Vehicles Available. American Community Survey, ACS 5-Year Estimates Detailed Tables, Table B08141. Retrieved June 17, 2024, from https://data.census.gov/table/ACSDT5Y2022.B08141?q=B08141: MEANS OF TRANSPORTATION TO WORK BY VEHICLES AVAILABLE&g=050XX00US50007$1400000.